Text Mining Using Markov Chains of Variable Length

نویسندگان

  • Björn Hoffmeister
  • Thomas Zeugmann
چکیده

When dealing with knowledge federation over text documents one has to gure out whether or not documents are related by context. A new approach is proposed to solve this problem. This leads to the design of a new search engine for literature research and related problems. The idea is that one has already some documents of interest. These documents are taken as input. Then all documents known to a classical search engine are ranked according to their relevance. For achieving this goal we use Markov chains of variable length. The algorithms developed have been implemented and testing over the Reuters-21578 data set has been performed.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Analysis of Users’ Web Navigation Behavior using GRPA with Variable Length Markov Chains

With the never-ending growth of Web services and Web-based information systems, the volumes of click stream and user data collected by Web-based organizations in their daily operations has reached enormous proportions. Analyzing such huge data can help to evaluate the effectiveness of promotional campaigns, optimize the functionality of Web-based applications, and provide more personalized cont...

متن کامل

Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes

Growing amount of information on biological sequences has made application of statistical approaches necessary for modeling and estimation of their functions. In this paper, sensitivity and specificity of the first and second Markov chains for prediction of genes was evaluated using the complete double stranded  DNA virus. There were two approaches for prediction of each Markov Model parameter,...

متن کامل

The Rate of Rényi Entropy for Irreducible Markov Chains

In this paper, we obtain the Rényi entropy rate for irreducible-aperiodic Markov chains with countable state space, using the theory of countable nonnegative matrices. We also obtain the bound for the rate of Rényi entropy of an irreducible Markov chain. Finally, we show that the bound for the Rényi entropy rate is the Shannon entropy rate.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005